ARM32/Thumb2: generated asm fixes #10725
Code generated with PR:
Jenkins: retest this please
wolfSSL-Fenrir-bot
left a comment
Fenrir Automated Review — PR #10725
No scan targets match the changed files in this PR. Review skipped.
Pull request overview
This PR updates multiple ARM32/Thumb2 generated assembly sources to improve correctness and simplify instruction sequences, primarily by adjusting reduction/carry handling and making immediate operands more consistent.
Changes:
- Normalize many immediate operands (hex → decimal, `#0x0` → `#0`, etc.) across Thumb2 SHA2/SHA3/ChaCha/Poly1305/ML-KEM codegen outputs.
- Update ARM32 Curve25519 field ops to a simpler underflow/overflow handling approach and adjust `fe_isnegative` to avoid a reload by caching the low bit early.
- Optimize AES/GCM assembly (including `ubfx` usage and arch<7 fallbacks) and remove a redundant register move in the AES key schedule path.
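
The `ubfx` change listed above can be illustrated in C. UBFX (unsigned bit-field extract) copies `width` bits starting at bit `lsb` into the low bits of the destination and zero-extends the result. It is not available on older ARM architectures, which is presumably why a fallback path is kept for arch<7. The sketch below models both shapes. The function names `ubfx_model` and `ubfx_shift_pair` are hypothetical, and the shift-pair form is only one plausible fallback, not necessarily the sequence this PR emits.

```c
#include <stdint.h>

/* Model of UBFX Rd, Rn, #lsb, #width.
 * Assumes 1 <= width <= 32 and lsb + width <= 32. */
static uint32_t ubfx_model(uint32_t x, unsigned lsb, unsigned width)
{
    uint32_t mask = (width == 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (x >> lsb) & mask;
}

/* One possible fallback without UBFX (an assumption, not the PR's code):
 * shift left to discard the bits above the field, then shift right
 * logically so the field lands in the low bits. */
static uint32_t ubfx_shift_pair(uint32_t x, unsigned lsb, unsigned width)
{
    return (x << (32u - lsb - width)) >> (32u - width);
}
```

For GCM-style nibble extraction, `ubfx_model(x, 4, 4)` returns the second-lowest nibble of `x`, and `ubfx_shift_pair(x, 4, 4)` gives the same value using two shifts.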
Reviewed changes
Copilot reviewed 18 out of 20 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| wolfcrypt/src/port/arm/thumb2-sha512-asm.S | Immediate-operand normalization in SHA-512 transform loop. |
| wolfcrypt/src/port/arm/thumb2-sha512-asm_c.c | Mirrors SHA-512 Thumb2 asm immediate changes in inline-asm C. |
| wolfcrypt/src/port/arm/thumb2-sha3-asm.S | Immediate-operand normalization in SHA-3 Thumb2 implementation. |
| wolfcrypt/src/port/arm/thumb2-sha3-asm_c.c | Mirrors SHA-3 Thumb2 asm immediate changes in inline-asm C. |
| wolfcrypt/src/port/arm/thumb2-sha256-asm.S | Immediate-operand normalization in SHA-256 Thumb2 implementation. |
| wolfcrypt/src/port/arm/thumb2-sha256-asm_c.c | Mirrors SHA-256 Thumb2 asm immediate changes in inline-asm C. |
| wolfcrypt/src/port/arm/thumb2-poly1305-asm.S | Immediate-operand normalization and small simplifications in Poly1305 Thumb2 asm. |
| wolfcrypt/src/port/arm/thumb2-poly1305-asm_c.c | Mirrors Poly1305 Thumb2 asm immediate changes in inline-asm C. |
| wolfcrypt/src/port/arm/thumb2-mlkem-asm.S | Immediate-operand normalization in ML-KEM Thumb2 asm loops. |
| wolfcrypt/src/port/arm/thumb2-mlkem-asm_c.c | Mirrors ML-KEM Thumb2 asm immediate changes in inline-asm C. |
| wolfcrypt/src/port/arm/thumb2-chacha-asm.S | Immediate-operand normalization in ChaCha Thumb2 asm. |
| wolfcrypt/src/port/arm/thumb2-chacha-asm_c.c | Mirrors ChaCha Thumb2 asm immediate changes in inline-asm C. |
| wolfcrypt/src/port/arm/thumb2-aes-asm.S | GCM nibble extraction simplification (ubfx) and immediate normalization; updates perf comment. |
| wolfcrypt/src/port/arm/armv8-32-curve25519.S | Simplifies underflow/overflow adjustment logic and tweaks fe_isnegative bit usage. |
| wolfcrypt/src/port/arm/armv8-32-curve25519_c.c | Mirrors Curve25519 asm changes; adds r12 clobber for updated fe_isnegative. |
| wolfcrypt/src/port/arm/armv8-32-aes-asm.S | Removes redundant move in AES key schedule; adds arch<7 fallback sequences for GCM nibble extraction. |
| wolfcrypt/src/port/arm/armv8-32-aes-asm_c.c | Mirrors AES/GCM asm changes in inline-asm C. |
douzzer
left a comment
don't merge as-is -- see wolfssl/scripts#590
Description
Fix Thumb2 Curve25519 asm to do a full reduce.
Change ARM32 to simpler carry/overflow processing. Minor optimizations: use ubfx, avoid moving a register into a temporary, and cache a value instead of loading it again later. Reduce the register pushes and pops in Thumb2 generated code. Fix Thumb2 generated code to emit immediate values less than 64 in decimal.
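
As background on what a full reduce involves, the following C sketch reduces a 256-bit value modulo p = 2^255 - 19 to its canonical representative in [0, p). It is a minimal illustration under stated assumptions (eight little-endian 32-bit limbs, hypothetical names `add_small` and `fe_full_reduce_sketch`) and is not the assembly from this PR.

```c
#include <stdint.h>

/* Add a small value v to the 256-bit number a, propagating carries. */
static void add_small(uint32_t a[8], uint32_t v)
{
    uint64_t c = v;
    int i;
    for (i = 0; i < 8; i++) {
        c += a[i];
        a[i] = (uint32_t)c;
        c >>= 32;
    }
}

/* Reduce a (8 little-endian 32-bit limbs) modulo p = 2^255 - 19,
 * leaving a value in [0, p). No data-dependent branches are used. */
static void fe_full_reduce_sketch(uint32_t a[8])
{
    uint32_t top, q;
    uint64_t c;
    int i;

    /* Step 1: fold bit 255 back in, since 2^255 = 19 (mod p).
     * Afterwards a <= 2^255 + 18. */
    top = a[7] >> 31;
    a[7] &= 0x7FFFFFFFu;
    add_small(a, 19u * top);

    /* Step 2: q = bit 255 of (a + 19), which is 1 exactly when a >= p. */
    c = 19u;
    for (i = 0; i < 7; i++) {
        c += a[i];
        c >>= 32;
    }
    c += a[7];
    q = (uint32_t)(c >> 31) & 1u;

    /* Step 3: a + 19*q with bit 255 cleared equals a - q*p. */
    add_small(a, 19u * q);
    a[7] &= 0x7FFFFFFFu;
}
```

With this sketch, an input equal to p reduces to 0 and an input of 2^256 - 1 reduces to 37. A partial reduce would stop after step 1 and could leave a value in [p, 2^255 + 18].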
Testing
Tested ARM32 (armv7a) and Thumb2 (armv7m) assembly configurations in QEMU.